ABSTRACT


The  performance  of  the  Nested  Tropical  Cyclone  Model  (NTCM)  for  542  track 
forecasts  in  the  western  North  Pacific  during  1981-1983  is  evaluated  with  respect  to  five 
storm-related  parameters:  intensity,  12-h  change  in  intensity,  latitude,  longitude  and  size. 
This  study  is  intended  to  aid  the  operational  forecaster  in  deciding  when  to  use  the  NTCM 
based  on  storm-related  parameters  at  the  forecast  time.  The  storm-related  parameters  are 
divided  into  three  subsamples  (about  180  in  each)  and  the  forecasts  are  evaluated  in  terms 
of  the  mean  forecast  error,  median  forecast  error  and  systematic  (zonal  and  meridional) 
error.  Cross-track  (CT)  and  along-track  (AT)  components  are  computed  relative  to  a 
CLImatology  and  PERsistence  (CLIPER)  track.  A  scoring  system  (M)  that  assesses 
penalty  points  for  forecasts  in  incorrect  terciles  is  used  to  compare  the  accuracy  of  the 
NTCM  and  CLIPER  forecasts  within  the  subsamples.  For  the  entire  sample,  the  NTCM 
has a slow bias, especially at the 12- through 36-h forecast periods. It also performs better
for storms with initial latitudes south of 13°N and initial longitudes west of 129°E. For
very large storms, the NTCM forecasts have both left-of-track and westward biases, which
indicate that the NTCM has problems predicting recurvature of such systems. The NTCM
forecasts (which use a 60-kt bogus storm) for storms with initial intensities between 50 and 75 kt
have much lower CT/AT M scores and smaller forecast errors than those for the subsamples with
initial intensities less than 50 kt or greater than 75 kt.


TABLE  OF  CONTENTS 


I. INTRODUCTION

II. THE NESTED TROPICAL CYCLONE MODEL

III. THE DATA SET

A. NTCM, CLIPER AND BEST TRACK POSITIONS

B. STORM-RELATED PARAMETERS

IV. ERROR STATISTICS

A. MEAN AND MEDIAN FORECAST ERRORS

B. SYSTEMATIC ERRORS

C. CROSS-TRACK AND ALONG-TRACK ERROR COMPONENTS

D. CONTINGENCY TABLES, CLASS ERRORS AND M SCORES

V. RESULTS

A. TOTAL SAMPLE STATISTICS

B. LATITUDE EFFECTS

C. LONGITUDE EFFECTS

D. INTENSITY EFFECTS

E. PAST 12-HOUR INTENSITY CHANGE EFFECTS

F. SIZE EFFECTS

VI. CONCLUSIONS

APPENDIX: CONTINGENCY TABLES AND PERCENTAGE OF CLASS ERROR TABLES

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST


LIST  OF  TABLES 


1. Means and standard deviations of 24, 48 and 72 h CT and AT error
components for a sample of best track positions

2a. Cross-track contingency tables, percentage of class errors and M scores

2b. As in Table 2a, except for along track

3. NTCM total sample (n=542) percent class errors, M scores, systematic,
mean and median forecast errors

4. CLIPER systematic, mean and median forecast errors for the total sample

5. Cross-track and along-track percent class errors and M scores for NTCM
stratified by latitude

6. Mean, median and systematic errors for NTCM stratified by latitude

7. As in Table 6 except for CLIPER

8. Cross-track and along-track percent class errors and M scores for NTCM
stratified by longitude

9. Mean, median and systematic errors for NTCM stratified by longitude

10. As in Table 9 except for CLIPER

11. Cross-track and along-track percent class errors and M scores for NTCM
stratified by intensity

12. Mean, median and systematic errors for NTCM stratified by intensity

13. As in Table 12 except for CLIPER

14. Cross-track and along-track percent class errors and M scores for NTCM
stratified by past 12-h intensity change

15. Mean, median and systematic errors for NTCM stratified by
past 12-h intensity change

16. As in Table 15 except for CLIPER

17. Cross-track and along-track percent class errors and M scores for NTCM
stratified by size

18. Mean, median and systematic errors for NTCM stratified by size

19. As in Table 18 except for CLIPER


A-1. Cross-track contingency tables, percent class error tables, and
M scores for 24-h NTCM forecasts stratified by latitude

A-2. As in A-1, except for 48 h

A-3. As in A-1, except for 72 h

A-4. As in A-1, except for along-track 24 h

A-5. As in A-4, except for 48 h

A-6. As in A-4, except for 72 h

A-7. As in A-1, except for cross-track 24 h stratified by longitude

A-8. As in A-7, except for 48 h

A-9. As in A-7, except for 72 h

A-10. As in A-7, except for along-track 24 h

A-11. As in A-10, except for 48 h

A-12. As in A-10, except for 72 h

A-13. As in A-1, except for cross-track 24 h stratified by intensity

A-14. As in A-13, except for 48 h

A-15. As in A-13, except for 72 h

A-16. As in A-13, except for along-track 24 h

A-17. As in A-16, except for 48 h

A-18. As in A-16, except for 72 h

A-19. As in A-1, except stratified by past 12-h intensity change

A-20. As in A-19, except for 48 h

A-21. As in A-19, except for 72 h

A-22. As in A-19, except for along-track 24 h

A-23. As in A-22, except for 48 h

A-24. As in A-22, except for 72 h

A-25. As in A-1, except for cross-track 24 h stratified by size

A-26. As in A-25, except for 48 h

A-27. As in A-25, except for 72 h

A-28. As in A-25, except for along-track 24 h

A-29. As in A-28, except for 48 h

A-30. As in A-28, except for 72 h


LIST  OF  FIGURES 


1a. Distribution of initial latitudes with tercile cutpoints

1b. As in Fig. 1a, except for initial longitudes

1c. As in Fig. 1a, except for initial intensities (kt)

1d. As in Fig. 1a, except for previous 12-h intensity change (kt)

1e. As in Fig. 1a, except for radii of 30-kt winds (n.mi.)

2. Definition of forecast and systematic error components

3. Definition of cross-track and along-track error components

4a. Distribution of best track 24 h CT error components

4b. Distribution of best track 48 h CT error components

4c. Distribution of best track 72 h CT error components

5a. Distribution of best track 24 h AT error components

5b. Distribution of best track 48 h AT error components

5c. Distribution of best track 72 h AT error components

6. Locations of latitude and longitude cutpoints in the western North Pacific


ACKNOWLEDGMENTS 


I  wish  to  express  my  gratitude  and  appreciation  to  the  following  people  who  provided 
encouragement,  guidance  and  assistance  throughout  this  study.  My  wife  Linda,  whose 
love  and  patience  were  uplifting  influences  that  dampened  the  frustrations  and  setbacks  that 
are  inevitable  in  such  a  project.  Professor  Russell  Elsberry,  who  conceived  the  project  and 
gave  outstanding  guidance,  counseling  and  assistance  essential  to  its  completion.  Dr. 
Johnny  Chan,  who  reviewed  the  manuscript  and  provided  sage  advice  on  both  the  content 
and  form  of  my  writing.  Mr.  Jim  Peak,  who  assisted  in  the  writing  of  the  computer  code 
required  to  compile  the  error  statistics.  I  also  wish  to  thank  Dr.  Ted  Tsui  and  Mr.  Michael 
Fiorino  who  provided  the  CLIPER  and  NTCM  forecast  data,  and  Mr.  Charles  Leonard 
who  manually  entered  the  radius  of  30-kt  winds  into  the  computer  data  base  from  the 
JTWC  warnings.  Without  the  help  of  these  people,  this  study  would  never  have  been 
completed.  It  has  been  my  distinct  pleasure  learning  from  them. 


I. INTRODUCTION

The enormous destructive potential of intense tropical cyclones is well known. The
high  winds,  heavy  seas  and  torrential  rain  that  accompany  these  systems  have  caused  great 
loss  of  life  and  damage  to  property  at  sea  and  ashore.  Thus,  it  is  not  surprising  that 
accurately  forecasting  the  movement  of  tropical  cyclones  is  of  primary  importance  to 
civilian  and  military  organizations  in  affected  regions.  Recognizing  this,  the  Commander  in 
Chief,  U.S.  Pacific  Command  has  given  an  improved  forecast  capability  the  highest 
priority  for  tropical  cyclone  research  objectives  within  the  Department  of  Defense  (DOD) 
(COMNAVOCEANCOM, 1984). Especially important are long-range (48- to 72-h)
forecasts,  which  are  required  by  operational  commanders  who  must  consider  movement  of 
ships  and  aircraft  to  avoid  damage  to  DOD  assets.  Civilian  authorities  also  need  advance 
warning  to  implement  public  disaster  preparedness  measures.  Noting  this  requirement  for 
increased  accuracy  in  track  forecasting,  the  United  States  Seventh  Fleet  Commander  has 
levied  a  requirement  on  the  Joint  Typhoon  Warning  Center  (JTWC)  in  Guam  to  achieve 
maximum  forecast  errors  of  50,  100,  and  150  nautical  miles  (n.mi.)  for  24,  48,  and  72  h 
respectively. 

During  the  past  decade,  the  rate  of  improvement  in  tropical  cyclone  track  forecasting 
has  not  been  as  rapid  as  hoped.  It  is  generally  accepted  that  a  "plateau"  has  been  reached  in 
the  annual  24-h  forecast  error  statistics  (Elsberry,  1984).  Improvements  in  72-h  forecasts 
have been realized, but only in some tropical cyclone regions (Thompson et al., 1981).
Furthermore, while some components of the tropical cyclone warning system have been
improved, others have been degraded. For example, the introduction and advancement of
satellite surveillance techniques have not compensated for the loss of data due to the
reduction in conventional observations and in reconnaissance flights (Elsberry, 1984).

The  most  important  recent  development  has  been  the  implementation  of  new  dynamic 
forecast  models  to  predict  tropical  cyclone  tracks  (Elsberry,  1983).  The  U.S.  Navy 
two-way  interactive  nested  tropical  cyclone  model  (NTCM)  was  originally  developed  by 
Harrison  (1973).  It  has  been  tested  with  operational  data  by  Harrison  (1981),  Harrison 
and Fiorino (1982), Fiorino et al. (1982) and Peak and Elsberry (1984). These tests with
a  large  number  of  cases  indicate  that  the  NTCM  has  high  potential  for  good  performance  at 
48  and  72  h  (Fiorino,  1985).  However,  problems  with  consistency  in  the  NTCM  tracks 
have  limited  its  value  as  an  operational  forecast  tool.  JTWC  recently  evaluated  NTCM 
track  predictions  in  the  western  North  Pacific  during  the  1984  tropical  cyclone  season  and 
found  that  the  NTCM-predicted  cyclone  movement  averaged  40  percent  less  than  that 
observed  (Sandgathe,  1985).  This  slow  bias  significantly  hampers  the  decision-making 
process of the typhoon duty officer (TDO) because the "decision points" in the forecast
track  (recurvature,  etc.)  are  forecast  too  late. 

The  primary  objective  of  this  thesis  is  to  determine  how  storm-related  parameters  affect 
the  NTCM-predicted  track.  This  knowledge  may  provide  valuable  information  to  the 
forecaster  concerning  the  veracity  of  a  particular  NTCM  forecast  based  on  certain 
storm-related  conditions  observed  at  the  time  the  tropical  cyclone  warning  is  issued.  An 
example  in  which  such  knowledge  may  have  been  useful  is  Supertyphoon  Abby  in  1983. 
Fiorino  (1985)  suggests  that  the  NTCM  and  virtually  all  of  the  other  forecast  aids  were 
incorrect  because  Abby  was  such  a  large  storm  (radius  of  30-kt  winds  greater  than  300 
n.mi.).  By  contrast,  Typhoon  Ike  (1984),  a  very  small  storm  (radius  of  30-kt  winds  less 
than  100  n.mi.),  was  also  incorrectly  forecast  by  NTCM.  In  addition  to  size,  the 
storm-related parameters of intensity, past 12-h intensity change, and position are studied
to determine what relationships exist between these parameters and the respective NTCM
forecasts.  The  intensity  and  intensity  change  parameters  are  chosen  because  the  NTCM 
includes  a  time-independent  bogus  storm  of  60-kt  intensity  in  the  initial  conditions. 

Knowledge  of  the  NTCM  performance  characteristics  is  essential  in  making  the 
correct  decision  to  accept  or  reject  a  particular  NTCM  forecast.  Such  performance 
characteristics  of  the  NTCM,  based  on  certain  storm-related  parameters,  are  described 
herein.  In  addition,  the  methodology  used  in  this  study,  while  developed  specifically  for 
the  NTCM,  can  (and  should)  be  applied  to  other  objective  tropical  cyclone  forecast  aids. 
Similar  studies  will  be  useful  to  compile  "rules  of  thumb"  for  each  aid  under  various 
storm-related  conditions.  Given  a  set  of  such  rules  and  the  initial  storm-related  parameters, 
a  forecaster  should  be  able  to  make  a  better  and  quicker  evaluation  of  the  relative  merit  of 
the  track  forecasts  from  each  objective  aid.  In  a  broader  context,  the  methodology  of  this 
study  may  be  used  to  provide  the  objective  measures  of  storm-related  or  synopticity  factors 
to  build  a  "decision-tree"  algorithm.  The  "decision-tree"  algorithm  suggested  by  Peak  and 
Elsberry  (1985)  selects  the  objective  aid  that  is  most  appropriate  to  each  forecast  situation, 
based  on  a  large  number  of  synopticity  and  storm-related  factors.  The  tree-structured 
approach  to  forecasting  is  expected  to  reduce  forecast  errors,  improve  training  and 
guidance for inexperienced TDOs, and provide a detailed record of the decision process
for  post-storm  analysis.  Because  of  the  myriad  of  possible  storm-related  and  synopticity 
factors,  and  the  numerous  existing  objective  aids  to  be  evaluated,  much  more  work  must 
be  done  before  the  "decision-tree"  concept  becomes  an  operational  reality. 
II. THE NESTED TROPICAL CYCLONE MODEL

The NTCM was originally developed by Harrison (1973) to demonstrate the concept
of  grid-nesting  with  two-way  interactive  boundaries.  After  early  tests  of  the  model  had 
shown  considerable  promise  (see  Harrison,  1981),  its  forecasts  have  been  received  on  a 
regular  basis  by  the  JTWC  since  1979.  Different  versions  of  the  NTCM  were  used  in 
subsequent  seasons  as  modifications  were  made  to  decrease  the  model  forecast  errors 
(Fiorino,  1985).  The  forecasts  analyzed  in  this  study  are  from  the  operational  model 
during 1983. The 1981 and 1982 storms were re-run by M. Fiorino using this version to
provide a homogeneous data set.

The  NTCM  is  a  three-layer  model  with  a  nested,  moving  grid  that  provides  high 
resolution  in  the  vicinity  of  the  cyclone  circulation.  The  inner  grid  remains  centered  on  the 
storm  position  as  it  moves  within  the  6600  km  x  4900  km  outer  region.  The  inner  grid  has 
a  1230  km  x  1230  km  domain  with  41  km  resolution.  The  coarse  grid  resolution  is  205 
km,  which  gives  a  five  to  one  reduction  at  the  interface.  The  NTCM  does  not  include 
topographic  effects.  A  simple  analytic  heating  function  centered  on  the  surface  cyclone  is 
used  to  maintain  the  cyclonic  circulation.  The  north-south  boundaries  of  the  outer  grid 
consist of free-slip walls while cyclic continuity is assumed in the east-west direction. The
inner grid has two-way interactive boundaries, which allow the cyclone circulation in the inner
grid to influence the environmental flow and vice versa. The model uses centered time
and space differencing techniques.
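
The grid dimensions quoted above imply a few simple ratios, which the following minimal sketch works out. The function name and the choice to count grid intervals (rather than grid points, which depends on details of the grid layout not given here) are assumptions made for illustration only.

```python
def nest_geometry(fine_domain_km, fine_dx_km, coarse_dx_km):
    """Return (resolution ratio, fine-grid intervals per side,
    coarse-grid intervals spanned by the fine-grid domain)."""
    ratio = coarse_dx_km / fine_dx_km
    fine_intervals = fine_domain_km / fine_dx_km
    coarse_intervals = fine_domain_km / coarse_dx_km
    return ratio, fine_intervals, coarse_intervals


# Values taken from the text: 1230 km inner domain, 41 km and 205 km spacing.
ratio, fine_n, coarse_n = nest_geometry(1230.0, 41.0, 205.0)
print(ratio, fine_n, coarse_n)
```

With the values from the text, the inner domain spans 30 fine-grid intervals (or 6 coarse-grid intervals) per side, and the coarse-to-fine spacing ratio is the five-to-one reduction described above.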

The  NTCM  is  initialized  from  the  global  band  tropical  analysis  fields  generated  by  the 
Fleet  Numerical  Oceanography  Center  (FNOC).  Because  of  the  channel  boundary 
conditions, the NTCM can be integrated independently of other models or inputs following
initialization from the analysis fields. This feature is particularly desirable from the
standpoint of operational timeliness (Elsberry, 1979).

The  NTCM  uses  a  reverse  balance  initialization  technique  for  wind  and  geopotential 
fields  (Harrison  and  Fiorino,  1982).  The  tropical  cyclone  is  simulated  by  a  bogus 
circulation  imposed  on  the  fine  grid  at  the  observed  location  of  the  storm.  The  initial 
intensity of the storm is always 60 kt. The streamfunction field is calculated from the
vorticity, which is obtained from the analyzed wind field. Divergence is allowed in the
solution of the nonlinear balance equation for the geopotential height field. The balanced
geopotential  values  are  then  interpolated  from  the  coarse  grid  to  the  edge  of  the  fine  grid, 
and  similar  balancing  is  performed  on  the  fine  grid.  Values  at  the  coincident  points  on  the 
fine  grid  are  then  substituted  for  the  interior  of  the  coarse  grid  solution.  The  entire 
initialization  process  is  repeated  two  or  three  times  to  ensure  that  both  grids  have 
converged  to  approximately  the  same  balanced  initial  fields.  Initialization  of  the  coarse  grid 
and  treatment  of  the  input  data  were  modified  for  the  1983  season  (Fiorino,  1985)  to 
improve  the  consistency  between  the  mass  and  wind  fields,  especially  near  the  channel 
boundaries. 

The basic philosophy of the model is to provide good long-range track predictions in a
timely manner for use by an operational forecaster. This study is an attempt to analyze and
understand the performance characteristics of the NTCM as a function of storm-related
parameters using a large data set. It will be shown that the performance of the model can
be  related  to  the  values  of  these  parameters  so  that  an  operational  forecaster  can  use  this 
information  to  help  decide  whether  or  not  to  use  the  NTCM. 


III. THE DATA SET

A. NTCM, CLIPER AND BEST-TRACK POSITIONS

The  position  data  set  consists  of  542  tropical  cyclone  cases  from  the  western  North 
Pacific  during  1981,  1982  and  1983  in  which  track  forecasts  are  available  up  to  72  h. 
These data include the NTCM and the western North Pacific CLImatology and PERsistence
model (CLIPER) forecasts as well as the verifying best-track positions in 12-h
increments  for  all  542  cases.  The  data  set,  kindly  provided  by  Mr.  Michael  Fiorino  of  the 
Naval  Environmental  Prediction  and  Research  Facility  (NEPRF),  represents  the  largest 
homogeneous  data  set  used  to  analyze  the  performance  of  the  NTCM.  Even  so,  the  542 
cases  represent  only  about  one-fourth  of  the  approximately  2200  tropical  cyclone  warnings 
issued in this region from 1981 through 1983. There are two reasons for this:

1. The NTCM was run only once every 12 h for the 1981 and 1982 seasons (every 6 h
for 1983), whereas the JTWC issues warnings every 6 h; and

2.  All  NTCM  forecasts  without  verifying  positions  to  72  h  were  excluded. 

The  72-h  CLIPER  forecasts,  also  provided  by  M.  Fiorino,  were  run  for  the  same 
cases  as  the  NTCM.  The  resultant  data  set  is  homogeneous  since  the  NTCM  and 
CLIPER  models  have  track  predictions  to  72  h  for  each  of  the  542  cases  and  verifying 
data  (best  track)  are  available  for  each  forecast  position. 

The  western  North  Pacific  CLIPER,  which  was  developed  by  Xu  and  Neumann 
(1985),  uses  regression  equations  to  relate  future  storm  positions  to  initial  position,  past 
12-  and  24-h  positions,  initial  intensity,  and  Julian  date.  The  equations  were  derived  for 
storms south of 35°N and west of 150°E which occurred during the months of May
through December. The forecasts to 24 h rely heavily on persistence, and more on
climatology at the 48- and 72-h forecast periods. The CLIPER track is selected as a
reference  in  calculating  the  cross-track  (CT)  and  along-track  (AT)  error  components  for 
both  the  NTCM  and  best-track  positions  (see  chapter  IV).  The  reason  for  using  CLIPER 
is  that  it  is  a  statistical  forecast  scheme  that  should  be  free  of  any  significant  bias  with 
respect  to  the  actual  storm  track. 


B.  STORM-RELATED  PARAMETERS 

The  storm  latitude,  longitude,  intensity,  previous  12-h  change  in  intensity  and  radius 
of  30-kt  winds  are  selected  as  the  storm-related  parameters  to  be  used  as  predictors.  The 
data  are  taken  from  the  JTWC  warnings  and  correspond  to  the  initial  times  of  the  542 
NTCM  and  CLIPER  forecasts.  These  five  parameters  are  chosen  for  two  reasons.  First, 
when  taken  from  the  JTWC  warnings,  they  represent  the  real-time  data  that  are  available  to 
the  TDO  at  the  time  the  NTCM  is  run.  Second,  these  storm-related  parameters  are 
expected  to  have  some  degree  of  influence  on  the  future  storm  track  (Elsberry,  1984). 

The  samples  of  each  of  the  five  storm-related  parameters  are  partitioned  into 
equal-sized  terciles.  The  cutpoints  between  the  terciles  are  then  used  to  segregate  the 
corresponding  sample  of  NTCM  and  CLIPER  forecasts  into  three  subsamples.  Various 
error  statistics  (see  chapter  IV)  are  computed  for  each  subsample  of  forecasts  and  examined 
to  determine  differences  in  NTCM  forecast  performance.  The  histograms  for  each  of 
these parameters (with the locations of the tercile cutpoints) are provided in Figs. 1a-1e.
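
The tercile partitioning described above can be sketched as follows. This is only an illustration under stated assumptions: the cutpoints are placed midway between adjacent sorted values, the function names are hypothetical, and the sample values are made up for the example rather than taken from the data set.

```python
def tercile_cutpoints(values):
    """Return (lower, upper) cutpoints that split the sorted sample into thirds.

    Assumption: each cutpoint lies midway between the two sorted values that
    straddle the one-third and two-thirds positions.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three values")
    i, j = n // 3, (2 * n) // 3
    lower = 0.5 * (ordered[i - 1] + ordered[i])
    upper = 0.5 * (ordered[j - 1] + ordered[j])
    return lower, upper


def split_by_cutpoints(cases, key, lower, upper):
    """Segregate forecast cases into three subsamples by a storm parameter."""
    low, mid, high = [], [], []
    for case in cases:
        value = key(case)
        if value < lower:
            low.append(case)
        elif value < upper:
            mid.append(case)
        else:
            high.append(case)
    return low, mid, high


# Made-up initial intensities (kt), for illustration only.
intensities = [25, 30, 35, 45, 50, 60, 75, 80, 100]
lower, upper = tercile_cutpoints(intensities)
weak, moderate, intense = split_by_cutpoints(intensities, lambda v: v, lower, upper)
```

When many cases share the same value (for example, intensities reported in 5-kt increments), cutpoints chosen this way cannot always produce exactly equal subsamples.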

The distribution of initial latitudes for the sample (Fig. 1a) is slightly skewed, with
maximum frequencies near the lower cutpoint (between 12°N and 13°N) and the mean
latitude (15.5°N) near the upper cutpoint (between 16°N and 17°N). There are 183, 177
and 182 cases for the "southern", "central" and "northern" areas. In the histogram of
initial longitude (Fig. 1b), the lower cutpoint is between 128°E and 129°E and the upper
cutpoint is between 139°E and 140°E. The distribution of initial longitudes also appears
slightly skewed, with the maximum frequency near the lower cutpoint. There are 169, 186
and 187 cases in the "western", "middle" and "eastern" areas.

The histogram of initial intensities (Fig. 1c) is skewed toward the lower intensities.
The width of the cells in the histogram is 5 kt because intensities on the JTWC warnings
are issued in 5-kt increments. The cutpoints, which are located between 45 and 50 kt and
between 75 and 80 kt, divide the data into subsamples which shall be referred to as
"weak", "moderate" and "intense" tropical cyclones. The numbers of cases in the
subsamples are 182, 182 and 178, respectively. The histogram of the previous 12-h intensity
change can be separated into "weakening", "developing" and "rapidly developing"
subsamples using the cutpoints between 0 and 5 kt and between 10 and 15 kt (Fig. 1d).
The numbers of cases in these subsamples are 190, 169 and 99, respectively. The sample
cannot be partitioned equally because the majority of the cases fall into just a few of the
cells, and the cells cannot be smaller than 5 kt of 12-h intensity change. The
sample is reduced to 458 cases because the intensity change cannot be
computed for the first warning of a tropical cyclone.

Noticeable "spikes" in the histogram of the radii of 30-kt winds (Fig. 1e) occur at 30,
100, 150 and 300 n.mi. When a warning gives two semicircles of wind radii, the larger of
the two is used. In addition, tropical cyclones ≤ 30 kt are assigned a radius of 30 n.mi.
The radius of 30-kt winds is often rather subjective, as peripheral data from aircraft
reconnaissance may not be available. These data were manually extracted by Mr. Charles
Leonard of the Department of Meteorology at NPS from over 2200 warning messages
issued by JTWC. The cutpoints are located between 105 and 110 n.mi. and between 205
and 210 n.mi., which separate the sample into "small", "medium" and "large" tropical
cyclones. The numbers of cases in the three subsamples are 186, 181 and 175, respectively.
IV. ERROR STATISTICS

A. MEAN AND MEDIAN FORECAST ERRORS


A  measure  of  accuracy  commonly  used  for  tropical  cyclone  track  forecasts  is  the 
"forecast  error",  which  is  defined  as  the  great  circle  distance  between  the  forecast  and 
verifying  position  (Fig.  2).  The  mean  forecast  error  is  simply  the  sum  of  the  errors 
divided  by  the  number  in  the  sample. 


Figure 2. Definition of forecast and systematic error components (ΔX and ΔY).
In this example, both ΔX and ΔY are negative.


Because the distribution of the forecast errors in a sample is bounded on one side by
zero and unbounded on the other, a few very large errors can inflate the mean. Many
studies therefore use the "median forecast error", which is the value of the 50th
percentile in the distribution.

The mean and median forecast errors at 12, 24, 36, 48, 60 and 72 h for the NTCM
and CLIPER (verified relative to best-track positions) are computed for the total sample
(542 cases) and for each subsample stratified by different values of storm-related
parameters. The unit used for these and all other error components is the kilometer (km).
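
As an illustration of these definitions, the sketch below computes the great-circle forecast error with the haversine formula and then the mean and median over a sample. The haversine formula, the Earth radius of 6371 km and the function names are assumptions for this example; the positions used are made up.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def forecast_error_summary(forecasts, best_tracks):
    """Return (mean, median) forecast error in km for paired (lat, lon) positions."""
    errors = [great_circle_km(f[0], f[1], b[0], b[1])
              for f, b in zip(forecasts, best_tracks)]
    return mean(errors), median(errors)


# Made-up positions: three forecasts verified against three best-track points.
forecasts = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
best_tracks = [(0.0, 1.0), (0.0, 2.0), (0.0, 6.0)]
mean_err, median_err = forecast_error_summary(forecasts, best_tracks)
```

In this made-up sample the single large error pulls the mean well above the median, which is the one-sided behavior that motivates reporting both statistics.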


B.  SYSTEMATIC  ERRORS 

Another  measure  of  error  for  tropical  storm  track  forecasts  is  the  systematic  error.  The 
systematic error components, ΣX and ΣY, are simply the zonal (ΔX) and meridional (ΔY)
errors averaged over the sample of forecasts (Fig. 2). The error components are calculated
for each 12-h forecast period to 72 h. The systematic error components are useful in
determining the presence (or absence) of an error bias in the sample. For example, a
monotonic increase or decrease throughout the forecast period indicates a systematic error
which might be statistically removed (Peak and Elsberry, 1982). The sign convention for
this study is positive if the forecast position is north (+ΣY) or east (+ΣX) of the best-track
position. The results of the systematic, mean and median error statistics for the NTCM
and CLIPER samples are discussed in chapter V (Tables 3 and 4).
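
A minimal sketch of the systematic error calculation follows, using the sign convention stated above (positive when the forecast is north or east of the best-track position). The local flat-Earth conversion from degrees to kilometers, the Earth radius and the function names are assumptions for illustration; the thesis does not specify how the zonal and meridional distances were computed.

```python
import math
from statistics import mean

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def zonal_meridional_error_km(forecast, best_track):
    """Return (dx, dy) in km; positive dx is east, positive dy is north.

    Assumption: a local equirectangular approximation evaluated at the
    mean latitude of the two (lat, lon) positions.
    """
    dlat = math.radians(forecast[0] - best_track[0])
    dlon = math.radians(forecast[1] - best_track[1])
    mean_lat = math.radians(0.5 * (forecast[0] + best_track[0]))
    dx = EARTH_RADIUS_KM * math.cos(mean_lat) * dlon
    dy = EARTH_RADIUS_KM * dlat
    return dx, dy


def systematic_error_km(pairs):
    """Average the zonal and meridional errors over (forecast, best_track) pairs."""
    components = [zonal_meridional_error_km(f, b) for f, b in pairs]
    return mean([c[0] for c in components]), mean([c[1] for c in components])


# Made-up cases: each forecast lies 1 degree south and 1 degree west of best track.
pairs = [((10.0, 130.0), (11.0, 131.0)), ((12.0, 128.0), (13.0, 129.0))]
sys_x, sys_y = systematic_error_km(pairs)
```

Both averaged components are negative in this example, which corresponds to the situation illustrated in Fig. 2.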

C. CROSS-TRACK AND ALONG-TRACK ERROR COMPONENTS RELATIVE TO EXTRAPOLATED CLIPER FORECASTS

Forecast  errors  are  also  presented  as  cross-track  (CT)  and  along-track  (AT) 
components.  The  objective  of  the  CT/AT  system  is  to  provide  information  to  the 
forecaster  about  the  movement  and  direction  of  the  storm  relative  to  a  standard  forecast  aid 
such  as  persistence  or  climatology  (Elsberry  and  Peak,  1986).  The  mean  and  median 
forecast  errors  give  only  the  magnitude  of  the  error  relative  to  the  actual  position  and  the 
systematic  error  gives  the  average  of  the  zonal  and  meridional  error  components.  On  the 
other  hand,  the  CT/AT  errors  also  provide  information  about  the  direction  of  the  forecast  in 
a  storm-oriented  reference  frame.  Elsberry  and  Peak  (1986)  evaluated  tropical  cyclone  aids 
based  on  CT  and  AT  components  relative  to  an  extrapolated  track  based  on  warning 
positions  at  the  initial  (00)  and  past  12-h  time  periods.  They  interpreted  the  CT 
components  as  turning  motion  and  the  AT  components  as  acceleration  or  deceleration. 
This  directionality  aspect  gives  important  information  to  the  forecaster  that  is  not  available 
from  the  other  error  measures. 


The  CT/AT  scheme  used  in  this  study  differs  from  that  of  Elsberry  and  Peak  (1986)  in 
that  the  CT  and  AT  components  for  the  NTCM  or  best-track  positions  for  each  forecast 
period (24, 48 and 72 h) are calculated relative to the CLIPER forecast at the corresponding
time.  For  example,  the  CT/AT  at  72  h  is  calculated  relative  to  a  line  connecting  the  72  and 
60  h  CLIPER  positions  (Fig.  3). 


Figure  3.  Definition  of  cross-track  (CT)  and  along-track  (AT)  components  at  72  h 
relative  to  an  extrapolated  track  based  on  CLIPER  positions  at 
72  and  60  h.  In  this  example,  CT  is  positive  (right)  and  AT  is 
negative  (slow)  with  respect  to  the  CLIPER  track. 


The  perpendicular  distance  from  the  NTCM  or  best-track  position  to  the  extrapolated 
track  is  the  cross-track  component,  with  positive  values  to  the  right  of  the  track  and 
negative to the left. The distance along the extrapolated track from the CLIPER position to
the  perpendicular  from  the  NTCM  or  best-track  position  is  the  along-track  component. 
Positive  (negative)  AT  values  occur  if  the  perpendicular  meets  the  track  ahead  (behind)  the 
corresponding  CLIPER  position. 
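
The geometry described above can be sketched with a projection onto the CLIPER track direction. This is an illustrative sketch, not the code used in the study: it assumes a local flat-Earth coordinate system centered on the verifying CLIPER position, an Earth radius of 6371 km, and hypothetical function names. The positions in the example are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_local_km(lat, lon, ref_lat, ref_lon):
    """Approximate (east, north) offsets in km from a reference point."""
    x = EARTH_RADIUS_KM * math.cos(math.radians(ref_lat)) * math.radians(lon - ref_lon)
    y = EARTH_RADIUS_KM * math.radians(lat - ref_lat)
    return x, y


def cross_along_track_km(prev_cliper, curr_cliper, position):
    """Return (CT, AT) of `position` relative to the CLIPER track.

    The track runs from prev_cliper (e.g. 60 h) through curr_cliper (e.g. 72 h).
    CT is positive to the right of the track; AT is positive ahead of curr_cliper.
    All arguments are (lat, lon) tuples in degrees.
    """
    ref_lat, ref_lon = curr_cliper
    px, py = to_local_km(prev_cliper[0], prev_cliper[1], ref_lat, ref_lon)
    dx, dy = to_local_km(position[0], position[1], ref_lat, ref_lon)
    tx, ty = -px, -py  # track direction: from previous to current CLIPER position
    length = math.hypot(tx, ty)
    if length == 0.0:
        raise ValueError("CLIPER positions coincide; track direction undefined")
    ux, uy = tx / length, ty / length
    at = dx * ux + dy * uy   # projection onto the track direction
    ct = dx * uy - dy * ux   # projection onto the right-hand normal
    return ct, at


# Made-up example: CLIPER moves due north; the verifying position lies to the
# east of the track and south of the 72-h CLIPER position.
ct, at = cross_along_track_km((19.0, 130.0), (20.0, 130.0), (19.5, 130.5))
```

In this example CT is positive (right of track) and AT is negative (slow), the same sign combination as in the Fig. 3 illustration.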


The  CT/AT  components  of  the  best  track  are  computed  for  the  entire  best-track  sample 
at  24-,  48-  and  72-h  forecast  periods.  The  means  and  standard  deviations  of  the 
distributions  for  each  forecast  period  are  shown  in  Table  1. 


Table 1

Means (x̄) and standard deviations (σ) of the 24-, 48- and 72-h CT and
AT components (km) for the total sample of best-track positions
(relative to CLIPER forecasts).

             24 h            48 h            72 h
           x̄      σ        x̄      σ        x̄      σ
CT        -22    164      -29    340      -41    574
AT        -66    165     -179    377     -276    594


Notice that the mean values of the 24-, 48- and 72-h CT errors are all very close to
zero,  which  indicates  that  the  best-track  CT  components  are  not  biased  with  respect  to  the 
CLIPER  track.  This  result  is  not  surprising  because  a  statistical  scheme  such  as  CLIPER 
should  have  no  bias  relative  to  the  overall  mean  position.  It  can  also  be  seen  that  the 
standard  deviation  increases  with  time.  The  symmetric  properties  of  the  CT  sample  are 
evident  in  the  histograms  for  the  samples  of  the  three  time  periods  (Figs.  4a-c).  The  tercile 
cutpoints  are  indicated  on  the  histograms  by  dashed  lines.  The  cutpoints  of  the  24-h  CT 
distribution  (Fig.  4a)  are  at  -75  km  and  50  km,  which  is  almost  exactly  centered  about  the 
mean  (-22  km).  The  48-h  CT  (Fig.  4b)  cutpoints  are  at  -125  km  and  125  km,  and  are  also 
symmetric  about  the  mean.  The  same  properties  can  be  seen  in  the  72-h  sample  (Fig.  4c), 
which  has  cutpoints  at  -200  km  and  200  km.  The  nearly  symmetric  distribution  of 
best-track  CT  error  components  around  the  mean  CLIPER  track  supports  the  use  of 
CLIPER  as  a  referencing  system  because  it  is  more  likely  to  provide  an  orientation  with 
respect  to  the  mean  track  of  the  tropical  cyclone.  The  terciles  have  been  labeled  left  (L), 
center (C) and right (R) according to the distributions of the best-track CT error
components  for  the  24-,  48-  and  72-h  distributions.  These  three  (L,  C  and  R)  categories 
are  used  to  compare  NTCM  forecasts  to  the  best-track  positions. 

The  AT  distributions  exhibit  characteristics  similar  to  those  of  the  CT  distributions 
discussed  above.  The  values  of  the  standard  deviation  for  the  AT  distributions  (Table  1) 
are  very  close  to  those  of  the  CT  for  all  three  time  periods.  The  AT  histograms  (Figs. 
5a-c)  resemble  the  CT  histograms  (Figs.  4a-c)  in  that  they  are  also  very  symmetric  about 
the  mean.  As  with  the  CT  error  components,  the  terciles  are  marked  on  Figs.  5a-c  and 
have  been  named  to  indicate  the  position  with  respect  to  the  extrapolated  CLIPER  track: 
slow (S), center (C) and fast (F). However, the negative mean (x̄) values (Table 1) of
-66, -179, and -276 km indicate that best-track positions are consistently "slow" with
respect to the extrapolated CLIPER track (or, that CLIPER is "fast" compared to the
best track). This results from the fact that, given the same initial position and identical
speed of movement, any deviation in direction of movement from the reference (past 12-h
extrapolated CLIPER) track will produce an apparent "slow" AT error component. This is
one of the shortcomings of attempting to define a storm-oriented coordinate system
(Neumann and Pelissier, 1981).

The  primary  advantage  in  using  the  CLIPER  forecast  rather  than  an  extrapolated  track 
from  warning  and  -12  h  positions  (as  was  done  in  Elsberry  and  Peak,  1986)  as  the 
reference  for  CT/AT  components  is  that  it  appears  to  be  an  excellent  storm-oriented 
coordinate  system.  This  is  especially  true  at  the  48-h  and  72-h  forecast  periods.  A  track 
extrapolated  from  warning  and  12-h  old  positions  is  very  representative  of  storm 
movement  for  the  early  (12-  to  24-h)  forecast  periods,  but  not  so  of  the  later  (48-  to  72-h) 
forecast  periods.  Compared  to  simple  extrapolation,  the  inclusion  of  climatology  in  the 
method  described  above  provides  a  better  CT/AT  frame  of  reference  at  all  forecast  periods 
because  it  is  evidently  more  representative  of  the  true  storm  track  at  all  time  periods. 
except for 72-h AT error components.


D.  CONTINGENCY  TABLES,  CLASS  ERRORS  AND  M  SCORES 


After  division  of  the  best-track  CT  and  AT  components  into  terciles,  a  scoring  system 
that  assesses  penalty  points  for  forecasts  that  fall  into  the  incorrect  tercile  is  used  to  rank 
the NTCM. The NTCM forecasts are also divided into terciles and each forecast is
compared to the tercile for the best track. A forecast is defined as having a zero-class error
if  it  falls  into  the  same  tercile  as  the  best  track,  a  one-class  error  if  it  is  in  a  tercile  adjacent 
to  that  of  the  best  track  and  a  two-class  error  if  it  is  two  terciles  away  from  the  best  track. 
Contingency  tables  for  the  CT  and  AT  components  are  then  formed  at  each  of  the  forecast 
intervals  (24, 48  and  72  h)  as  shown  in  Tables  2a  and  2b. 
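The tercile assignment and class-error definition above can be illustrated with a short Python sketch. The function names are hypothetical, and the treatment of values that fall exactly on a cutpoint (assigned here to the center tercile) is an assumption, since the text does not specify it. Rows of the table correspond to the best-track tercile and columns to the forecast tercile.

```python
def tercile_index(value_km, lower_cut_km, upper_cut_km):
    """Return 0 (left/slow), 1 (center) or 2 (right/fast).

    Assumption: a value exactly on a cutpoint counts as center.
    """
    if value_km < lower_cut_km:
        return 0
    if value_km <= upper_cut_km:
        return 1
    return 2

def class_error(best_km, forecast_km, lower_cut_km, upper_cut_km):
    """Number of terciles separating the forecast from the best track."""
    best = tercile_index(best_km, lower_cut_km, upper_cut_km)
    fcst = tercile_index(forecast_km, lower_cut_km, upper_cut_km)
    return abs(best - fcst)

def contingency_table(pairs, lower_cut_km, upper_cut_km):
    """Build a 3x3 table of counts from (best_km, forecast_km) pairs."""
    table = [[0, 0, 0] for _ in range(3)]
    for best_km, forecast_km in pairs:
        row = tercile_index(best_km, lower_cut_km, upper_cut_km)
        col = tercile_index(forecast_km, lower_cut_km, upper_cut_km)
        table[row][col] += 1
    return table

# Made-up CT values (km) scored with the 24-h CT cutpoints of Fig. 4a.
pairs = [(-120.0, -90.0), (10.0, 80.0), (130.0, -100.0)]
table = contingency_table(pairs, -75.0, 50.0)
```

In this layout the zero-class errors lie on the diagonal and the two-class errors in the upper-right and lower-left corners.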

The  upper  portion  of  Table  2a  gives  the  contingency  tables  for  the  NTCM  for  all  three 
time  periods.  The  cutpoints  that  define  the  tercile  boundaries  (see  also  Figs.  4a-c)  are 
indicated  just  below  the  contingency  tables.  The  zero-class  errors  are  arranged  in  the  bins 
located  along  the  upper-left  to  lower-right  diagonal.  The  two-class  errors  are  located  in  the 
upper-right  and  lower-left  bins  and  the  remaining  bins  contain  the  one-class  errors.  A 
higher  number  in  the  zero-class  diagonal  relative  to  the  one-  and  two-class  error  bins 
indicates  a  greater  skill  level.  For  example,  the  total  number  of  zero-class  CT  errors  for  the 
48-h  time  period  is  280,  or  slightly  more  "hits"  than  at  either  24  h  (236)  or  72  h  (265). 
The  totals  column  on  the  right  side  of  each  contingency  table  indicates  the  number  in  each 
best-track  tercile  (L,  C  and  R).  Similarly,  the  totals  along  the  bottom  row  of  each 
contingency  table  show  the  number  of  NTCM  forecasts  that  fall  into  the  best-track  L,  C 
and  R  categories.  Notice  that  fewer  NTCM  forecasts  fall  into  the  "R"  category  at  48  and 
72  h  (123  and  130)  than  the  best-track  (178  and  189),  but  the  number  of  NTCM  forecasts 
in  the  "R"  category  at  24  h  (171)  is  very  close  to  the  best  track  (186).  This  indicates  the 
NTCM  has  a  left  bias  in  the  later  forecast  periods,  but  none  at  the  24-h  period. 


[Table 2a residue] Totals 51.6 38.9 9.5; Totals 48.9 39.9


The  lower  half  of  Table  2a  contains  the  percentages  of  NTCM  zero-,  one-  and 
two-class  errors  for  each  (L,C,R)  best-track  tercile  and  the  totals.  The  percentages  provide 
information  about  the  general  distribution  of  errors.  For  example,  notice  that  at  72  h,  a 
higher  percentage  of  two-class  errors  occur  when  the  best  track  is  in  the  "R"  tercile  (22.7) 
than  in  the  "L"  tercile  (10.6).  This  indicates  that  at  72  h,  the  NTCM  is  more  than  twice  as 
likely to be left of the best track when there is a two-class error.

Table  2b  is  similar  to  2a,  but  contains  the  AT  contingency  tables  (S,  C,  and  F 
categories).  The  highest  number  of  zero-class  errors  is  in  the  72-h  period  (286).  Notice 
that  the  number  of  NTCM  forecasts  that  fell  into  the  slow  (S)  categories  for  all  three  time 
periods  is  very  high.  This  agrees  with  the  observation  by  Sandgathe  (1985)  that  the 
NTCM movement is on the average 40% less than the observed cyclone movement.

A primary motivation for the tercile pattern separation into contingency tables is to
determine if the NTCM correctly distinguishes between left-turning and right-turning as
well as slow and fast storms (Elsberry and Peak, 1986). The lower portions of Tables 2a
and 2b summarize the percentage of class errors for each category (L, C and R or S, C and
F) of the sample. For example, in the 72-h portion of Table 2a, 180 of the storms moved
to the left of the CLIPER track (total of first row). Of these, 103 (57.2%) are forecast
correctly by the NTCM, 58 (32.2%) are forecast to be in the center tercile (one-class error)
and 19 (10.6%) in the right-turning tercile (two-class error). The percent of each class of
errors for the best-track terciles (L, C, R or S, C, F) and the total sample are tabulated
below the contingency tables. In the above example for 72 h, the "totals" row shows that
48.9%, 39.9% and 11.2% of the NTCM forecasts for CT were in the zero-, one- and
two-class error categories, respectively. For comparison purposes, a purely random
selection would have percentages of 33.3%, 44.4% and 22.2%, respectively. Thus, the
NTCM is more skillful than a random forecast for this sample.
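As a check on the arithmetic, the class-error percentages can be computed from a contingency table as in the following Python sketch. The function names are hypothetical; the table layout (rows for the best-track tercile, columns for the forecast tercile) follows the description of Table 2a. A table with equal counts in all nine bins stands in for a purely random selection.

```python
def class_error_percentages(table):
    """Percent of zero-, one- and two-class errors in a 3x3 table."""
    counts = [0, 0, 0]
    for row_index, row in enumerate(table):
        for col_index, n in enumerate(row):
            counts[abs(row_index - col_index)] += n
    total = sum(counts)
    return tuple(100.0 * c / total for c in counts)

def row_percentages(row, best_index):
    """Percent of zero-, one- and two-class errors for one best-track row."""
    counts = [0, 0, 0]
    for col_index, n in enumerate(row):
        counts[abs(best_index - col_index)] += n
    total = sum(counts)
    return tuple(100.0 * c / total for c in counts)

# Random baseline: every bin equally likely, so 3 of the 9 bins are
# zero-class, 4 are one-class and 2 are two-class.
random_pct = class_error_percentages([[1, 1, 1], [1, 1, 1], [1, 1, 1]])

# The 72-h left-tercile row quoted in the text (180 best-track cases).
left_row_pct = row_percentages([103, 58, 19], best_index=0)
```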


A  further  distillation  of  the  information  contained  in  the  contingency  tables  is  made  as 
an  aid  to  compare  quantitatively  the  performance  of  forecasts.  Preisendorfer  and  Mobley 
(1982)  devised  a  scoring  system  to  represent  the  level  of  skill  in  a  forecast  as  a  single 
number  (M)  defined  as 

M  =  V  +  2W,  (1) 

where  (U,V,W)  are  the  percentages  of  (zero-,  one-,  two-class)  errors  such  that 

U  +  V  +  W  =  100.  (2) 

The  quantity  M  is  simply  a  linear  penalty  score  according  to  the  error  class;  the  lower  the  M 
score,  the  higher  the  degree  of  skill.  In  the  example  used  above  (Table  2a),  the  M  score 
for  the  NTCM  at  72  h  for  the  CT  component  is  62.1.  A  random  tercile  selection  would 
have  an  M  score  of  88.9.  Therefore,  the  M  score  also  indicates  that  the  NTCM  is  more 
skillful  than  a  random  forecast. 

An  M  score  for  the  CLIPER  is  suggested  as  another  standard  of  comparison.  Because 
the  terciles  are  defined  relative  to  CLIPER,  the  CLIPER  forecast  track  will  always  be  in  the 
center  tercile.  Thus,  there  can  never  be  more  than  a  one-class  error.  However,  the  terciles 
are  constructed  so  that  for  both  the  CT  and  the  AT  distributions  66.7%  of  the  cases  are  not 
in  the  center  tercile.  The  CLIPER  forecast  will  always  fail  by  one  class  in  these  cases. 
Thus,  the  M  score  is  simply  66.7  for  both  the  CT  and  the  AT  components.  For  the  total 
sample  (see  Tables  2a  and  2b),  the  CT/AT  M  scores  for  the  NTCM  at  48  h  (58.0/61.4)  and 
72  h  (62.1/58.1)  indicate  that  the  NTCM  is  more  skillful  than  CLIPER  at  the  later  forecast 
periods.  However,  the  24-h  CT/AT  M  scores  (68.3/70.7)  indicate  that  the  NTCM  is 
essentially  a  no-skill  forecast  at  this  time  period. 
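Equations (1) and (2) and the two reference scores can be reproduced with a brief Python sketch. The function name and the 0.5-point tolerance for rounding in tabulated percentages are assumptions made for illustration.

```python
def m_score(zero_pct, one_pct, two_pct):
    """Linear penalty score M = V + 2W of Eq. (1); lower is better.

    Assumption: inputs are percentages that sum to 100 (Eq. 2) within
    a small tolerance that allows for rounding in published tables.
    """
    if abs(zero_pct + one_pct + two_pct - 100.0) > 0.5:
        raise ValueError("percentages must sum to 100")
    return one_pct + 2.0 * two_pct

# Purely random tercile selection: U, V, W = 33.3, 44.4, 22.2 percent.
random_m = m_score(100.0 / 3.0, 400.0 / 9.0, 200.0 / 9.0)

# CLIPER is always in the center tercile, so it is correct for one third
# of the cases and off by one class for the rest.
cliper_m = m_score(100.0 / 3.0, 200.0 / 3.0, 0.0)

# NTCM 24-h CT percentages as listed in Table 3.
ntcm_24h_ct_m = m_score(43.5, 44.7, 11.8)
```

Rounded to one decimal place, these evaluate to 88.9, 66.7 and 68.3, the random, CLIPER and 24-h CT values used in the text.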


V.  RESULTS 


A.  TOTAL  SAMPLE  STATISTICS 

The  CT  and  AT  percentages  of  class  errors,  M  scores,  mean  and  median  errors  and 
systematic  errors  for  the  total  NTCM  sample  are  summarized  in  Table  3.  The  CT  M  scores 
(68.3,  58.0  and  62.1  at  24,  48  and  72  h)  suggest  that  overall,  the  NTCM  forecasts  are 
more skillful at 48 and 72 h than at 24 h. The AT M scores (70.7, 61.4 and 58.1 at 24, 48
and  72  h)  also  indicate  a  similar  result.  Also,  the  NTCM  performs  better  than  the  CLIPER 
(M=66.7)  at  these  time  periods.  However,  the  24-h  M  scores  of  the  CT  and  AT 


TABLE 3

NTCM total sample (542 cases) percent class errors and M scores (left).
Systematic, mean and median forecast errors (right).

               %0     %1     %2      M
24 h   CT     43.5   44.7   11.8   68.3
       AT     44.3   40.6   15.1   70.8
48 h   CT     51.7   38.7    9.6   57.9
       AT     49.4   39.5   11.1   61.7
72 h   CT     48.9   39.7   11.4   62.5
       AT     52.8   36.3   10.9   58.1

          ΣX     ΣY     Mn     Md
48 h     ...    ...    397    355
60 h       9     -3    508    453
72 h      -7     -9    626    565


it  can  be  seen  that  this  high  percentage  is  due  to  a  large  number  of  two-class  errors  in  the 
lower-left corner of the table (68). This indicates that the NTCM has a slow bias,
especially  at  the  24-h  period. 

The  mean  (Mn)  and  median  (Md)  forecast  errors  for  the  overall  sample  of  NTCM 
forecasts  (Table  3)  and  the  CLIPER  (Table  4)  suggest  that  the  NTCM  performance  is 
generally  no  better  than  CLIPER  at  the  early  (12-  and  24-h)  time  periods.  However,  the 
NTCM consistently has lower forecast errors at the later (36- through 72-h) periods. For
this  sample  of  forecasts,  the  CT/AT  error  statistics,  which  measure  forecasting  skill  based 
on  "storm-motion"  coordinates,  are  in  good  agreement  with  the  forecast  error  statistics, 
which  account  only  for  the  distance  between  the  forecast  and  the  best-track  position. 


TABLE 4

CLIPER systematic (ΣX and ΣY), mean (Mn) and median (Md)
forecast errors (km) for the total sample (542 cases).

          ΣX     ΣY     Mn     Md
12 h      -6     21    ...    ...
24 h       3     47    ...    ...
36 h      22     73    ...    ...
48 h      41     96    ...    ...
60 h      48    115    ...    ...
72 h      56    121    ...    ...

storms, these positive values suggest that the NTCM is "slow" during the early forecast
periods. Most of the storms in this sample will have a component toward the west because
of the requirement that a complete 72-h track be included. This will tend to reduce the
number of the eastward-moving storms that tend to undergo extratropical transition prior to
72 h. Notice that in the 48- to 72-h time period, the values of ΣX decrease from 17 to -7,
which indicates that the average NTCM position becomes slightly west of the best-track
position at 72 h. However, this error is very small compared with the mean and median
forecast errors. The meridional (ΣY) components of the systematic error of the NTCM are
also negligible. In fact, the largest deviation from zero at 48 h is only 16 km south of
best-track latitude (Table 3), which is well within the "noise".

The CLIPER systematic errors (Table 4) indicate that the average forecast positions are
generally east and north of the best track. Although these systematic errors are not large,
near-zero values had been expected. This seems to suggest that this sample from 1981-1983
had somewhat different characteristics than the sample used to create the CLIPER
algorithm.
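The distinction between systematic errors and mean or median forecast errors can be made concrete with a small Python sketch. It assumes, consistent with the sign discussion above, that ΣX and ΣY are the averages of the signed east-west and north-south offsets of the forecast from the best track (positive toward the east and north), and that positions are given in local planar coordinates in km. The function name and data are hypothetical.

```python
import math
import statistics

def error_summary(forecasts, best_tracks):
    """Summarize forecast errors for paired (x_east_km, y_north_km) positions.

    Assumption: the systematic error is the mean signed offset, while the
    forecast error is the unsigned distance between the two positions.
    """
    dx = [f[0] - b[0] for f, b in zip(forecasts, best_tracks)]
    dy = [f[1] - b[1] for f, b in zip(forecasts, best_tracks)]
    distance = [math.hypot(x, y) for x, y in zip(dx, dy)]
    return {
        "sigma_x": statistics.mean(dx),         # positive: forecast east of best track
        "sigma_y": statistics.mean(dy),         # positive: forecast north of best track
        "mean_error": statistics.mean(distance),
        "median_error": statistics.median(distance),
    }

# Made-up offsets: individual errors of 50 km largely cancel in the
# zonal average, so the systematic error is far smaller than the mean error.
best = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
fcst = [(30.0, 40.0), (-30.0, -40.0), (0.0, 50.0)]
summary = error_summary(fcst, best)
```

This is why small ΣX and ΣY values can coexist with mean forecast errors of several hundred kilometers.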

B.  LATITUDE  EFFECTS 

As indicated in Fig. 1a, the sample of NTCM forecasts is divided into southern
(latitudes < 13° N), central (between 13° and 17° N) and northern (> 17° N) samples. The
locations of the latitude and longitude (section C) tercile cutpoints are shown in Fig. 6.

Two obvious points arise from an inspection of the M scores of the latitude-stratified
subsample (Table 5). First, the M scores of the 48- and 72-h CT components for the
southern area are much lower than those for the central and northern areas. This suggests
that the NTCM is more skillful in forecasting the direction of storm movement for systems
with initial positions south of 13° N. Second, both CT and AT M scores indicate that the
NTCM has less skill in forecasting direction and speed at 24 h than at 48 h and 72 h for all
three subsamples. This result can also be seen in the forecast error statistics (Table 6),
which indicate that the NTCM has higher mean and median 24-h forecast errors in the
southern and central areas than CLIPER (Table 7).

Figure 6. Locations of latitude and longitude tercile cutpoints in the western North Pacific (thick lines).

Although the CT and AT M scores generally decrease with increasing forecast period,
the AT M scores for the northern subsample are an exception. The 24-h score is very low
(56.0) with respect to that for the total sample (70.8), and the M score increases slightly to
58.2 at 72 h. This seems to indicate that the slow bias of the NTCM (mentioned above) is
less pronounced for storms with initial latitudes north of 17° N. Inspection of the
contingency table (Table A-4, appendix) indicates that the number of two-class errors in
the slow category for the northern subsample (11) is much smaller than those for the southern
(28) and central (29) subsamples. In addition, the NTCM median 24-h forecast error
(Table 6) for the northern area is much smaller than those for the southern and central
areas (165 km versus 235 and 198 km, respectively). This 24-h median forecast error is
even slightly smaller than that of CLIPER (175 km, see Table 4) for the total sample.
Therefore, the slow bias of the NTCM at 24 h is largely due to the storms initially south of
17° N. This initial slow bias probably contributes to increased forecast errors at 48 and 72
h because it leads to an incorrect timing of recurvature (Sandgathe, 1985). Missing the
time of recurvature can produce large forecast errors. Although the AT errors for the
northern area are quite small, the large CT errors seem to offset them at the 48- and 72-h
time periods.

Notice  that  the  NTCM  mean  and  median  forecast  errors  at  72  h  (Table  6)  for  the 
northern  subsample  (644  and  578  km)  are  greater  than  those  of  the  CLIPER  (Table  7)  for 
this  subsample  (633  and  537  km),  which  is  consistent  with  the  NTCM  CT  M  score  at  72  h 
(75.3)  being  much  higher  than  that  of  the  CLIPER  (66.7).  Therefore,  the  NTCM  is  no 
more  skillful  than  CLIPER  for  storms  north  of  17°  N,  even  at  the  48-  and  72-h  periods. 
Only in the southern subsample does the NTCM clearly outperform CLIPER at 48 and
72 h with respect to all of the error statistics: CT, AT, and mean and median forecast errors
(see Tables 6 and 7).


One explanation of the apparently good performance of the NTCM in terms of the CT
errors (especially at 48 and 72 h) in the southern area may be that the synoptic features that
cause recurvature are less likely to extend into this region (south of 13° N). Therefore, the
lack of recurvature influences on the storm tracks probably contributes to the low CT M
scores at 72 h (50.3 for the southern area versus 61.6 and 75.8 for the central and northern
areas).

The systematic errors of the NTCM (Table 6) indicate that the meridional (ΣY)
averages for all three areas are very close to zero and show no systematic change with
increasing forecast period. However, the central area exhibits an increase in zonal (ΣX)
error from 62 km to 200 km from 12 to 72 h, which indicates that the NTCM forecasts are
east of the best track. Conversely, the northern area zonal error decreases from 33 km to
-210 km throughout the period, with the NTCM becoming farther west of the best track.
The absence of such large systematic errors in the southern area is consistent with the
other error statistics, which suggest that the NTCM performs best for storms initially
south of 13° N.

C.  LONGITUDE  EFFECTS 

The cutpoints for dividing the sample of forecasts into western, middle and eastern
areas are 129° E and 140° E (Figs. 1b and 6). The lowest CT M scores for the NTCM are
found in the western area (Table 8). This is due to a low percentage of two-class errors in
the western area for all three time periods (3.5, 2.4 and 5.3% for 24, 48 and 72 h). The
contingency tables (Tables A-7, A-8 and A-9) also do not indicate any left or right bias of
the NTCM in the western area. Although the CT M scores at 24 and 48 h are very low
(50.8 and 43.8) for the western area, the corresponding AT M scores are higher
(76.9 and 67.4) than those in the middle and eastern areas. This offsetting effect degrades
the overall performance of the NTCM. Research is required to improve the NTCM so that
it has low M scores in both components.

[Table residue] No. in Subsample = 169; No. in Subsample = 186; No. in Subsample = ...

The  eastern  area  has  the  next  lowest  CT  M  scores,  which  decrease  from  74.8  to  61.5 
to  54.5  at  24, 48  and  72  h.  The  72-h  value  is  even  slightly  lower  than  the  corresponding 
western  area  CT  M  score.  The  highest  CT  M  scores  are  found  in  the  middle  area  (77.4, 
67.1  and  74.7  at  24,  48  and  72  h).  The  CT  performance  for  this  longitude  band  is  less 
skillful  than  CLIPER  (M  =  66.7)  at  all  forecast  periods. 

Except  for  the  very  poor  AT  performance  in  the  western  area  mentioned  above,  the  AT 
M  scores  do  not  show  major  variations  between  longitude  bands.  The  AT  M  scores  at  all 
three time periods for the middle and eastern areas are similar to those of the total
NTCM  sample  (Table  3). 

The systematic error measures of the NTCM (Table 9) also show no major departures
from those of the overall sample statistics in Table 3. The ΣX and ΣY for all three
subsamples are generally less than 70 km. In the eastern area the NTCM has small and
nearly constant eastward zonal (ΣX ≈ 50 km) and northward meridional (ΣY ≈ 50 km)
errors throughout the forecast period. In the middle area, the errors are fairly constant
throughout the forecast period with a slight southward meridional displacement (ΣY ≈ -30
km) and a monotonic variation from an eastward (ΣX = 58 km) to a westward zonal
displacement (ΣX = -74 km). For a westward-moving storm, this may be interpreted as
the NTCM track starting out "slow" or east of the best track and "passing" or moving west
of the best-track longitude over the 72-h time period. Very small variations of the
systematic errors with forecast period (< 50 km) are observed in the western area. This is
consistent with the earlier finding that the CT/AT M scores are generally lower and
indicates again that the NTCM is highly skillful in the western area.


The mean and median forecast errors (Table 9) are also consistent with the CT/AT and
systematic error statistics. That is, the smallest mean and median forecast errors for all
forecast periods are found in the western area and the highest are in the eastern area.
Although the NTCM is nearly as skillful as the CLIPER at 24 h in the eastern area, the
CLIPER generally outperforms the NTCM at 12 and 24 h. In addition, the CLIPER
forecast errors are almost as low as or lower than those of the NTCM at all forecast periods
in the middle area. The NTCM outperforms CLIPER by about 40 to 100 km (both mean and
median errors) at 36 through 72 h in the western and eastern areas. In the western area,
the 48- and 72-h NTCM median forecast errors are 93 and 184 km lower than those of the
CLIPER.

In summary, the NTCM performs better in terms of all of the error statistics for storms
with initial longitudes west of 129° E. One explanation may be that the western area
storms are closer to the relatively data-rich continental areas (Fig. 6) compared to the
data-sparse eastern regions. Thus, the initial wind fields in the NTCM are more likely to
be representative of the true wind fields. The frequency of storm fix positions also
increases in this area because of the proximity to land-based radar and synoptic data, which
provide a better initial position for the NTCM.

D.  INTENSITY  EFFECTS 

As indicated in Fig. 1c, the sample of NTCM forecasts is divided into storms with
initial intensity < 50 kt, between 50 and 75 kt and ≥ 80 kt. These groups will be referred
to as the weak, moderate and intense subsamples, respectively. Recall that the initial
intensity of the bogus storm in the NTCM is always 60 kt, which is near the mean of the
moderate subsample.

The M scores for both CT and AT errors are relatively low for the moderate subsample
(Table 11). In fact, the M scores for both CT and AT at every forecast period (24, 48 and
72 h) are considerably smaller for the moderate subsample than those for the other two
subsamples. Nearly all of the M scores in the moderate subsample are at least ten points
better than the M scores from the total sample (Table 3). An exception is the 72-h AT M
score, which is 53.8 for the moderate subsample and 58.1 for the total sample. The M
scores of the weak subsample are generally the highest of the three subsamples. A
possible explanation is that the deep tropospheric bogus storm in the NTCM is not a good
representation of these weak storms. The M scores for the intense subsample are closer to
the total sample scores (Table 3), but are higher in the 48- and 72-h CT cases.

The contingency tables for intensity stratifications (Tables A-13 to A-18) provide
further explanation of the M scores. Notice that for all three forecast periods, the NTCM
CT errors are biased to the right of the best track for the weak group, are fairly evenly
distributed about the best track for the moderate subsample, and are typically to the left of
the best track for the intense subsample. These results suggest that the NTCM may predict
recurvature too quickly for the less intense storms and may be slow in recurving storms
with intensity ≥ 80 kt. The 60-kt bogus storm may result in excessive poleward deflection
of the weak storms that are expected to be traveling from east to west. By contrast, the
poleward deflection may be underestimated by the bogus storm in the NTCM when the
storm is actually more intense. This is especially true for right-moving storms (relative to
CLIPER) at 72 h, when the NTCM tends to forecast a left-moving path (two-class error) in
40.6% of the cases.

The AT M scores (Table 11) are also lower for the moderate subsample, although at
72 h, the score is not much lower than that of the intense subsample (53.8 versus 57.3,
respectively). The high percentage of two-class errors in the fast category of the 24-, 48-
and 72-h AT contingency tables (Tables A-16, A-17 and A-18) indicates a slow bias in each
subsample. For the weak subsample, a high percentage of two-class errors occurs at all
three forecast intervals, especially at 24 h (50%). Although this slow bias is less prevalent
in the intense subsample, the AT M scores are higher than those of the moderate group at
each time interval. The lower M scores in the moderate subsample are due to the lower
number of one-class errors, even though at 72 h there is a high percentage (31.1%) of
two-class errors in which the NTCM is slower than the best track (Table A-18).

The systematic errors for the NTCM (Table 12) indicate that there is little or no
systematic growth in longitudinal (ΣX) errors in the moderate and weak subsamples. The
NTCM position in both cases is east (73 km and 50 km for the weak and moderate
subsamples, respectively) of the average best-track position at 12 h and remains almost
constant with increasing time. However, a large systematic growth in longitudinal error
occurs in the intense subsample. The zonal error (ΣX) changes monotonically from 16 to
-150 km with time, which indicates that the average NTCM position becomes farther
west of the best track with increasing forecast period for those intense storms. Only a
small meridional error (ΣY) is found for the different storm intensities. The 72-h NTCM
forecasts are slightly to the south of the best track for the weak and moderate subsamples
and slightly to the north in the moderate subsample.

Forecast  errors  of  the  NTCM  in  the  moderate  subsample  are  much  smaller  than  those 
of  CLIPER  beyond  12  h  (Tables  12  and  13).  The  NTCM  mean  and  median  forecast  errors 
in  the  intense  subsample  are  about  the  same  as  in  the  moderate  subsample,  even  though  the 
CT  and  AT  results  seem  to  indicate  much  lower  directional  and  speed  errors  for  the 
moderate  subsample.  A  possible  explanation  for  this  result  is  that  the  accuracy  of  the  initial 
position  from  fixes  by  any  platform  (aircraft,  satellite  or  radar)  is  much  greater  for  cyclones 
that  have  developed  an  eye  (or  at  least  a  well-defined  circulation  center).  Since  initial 
position  errors  are  propagated  along  the  forecast  track,  the  NTCM  mean  and  median 
forecast  errors  for  the  intense  subsample  should  be  smaller  than  those  of  the  weak  or 
moderate subsamples by virtue of better initial position inputs. The CLIPER (which
should be unbiased with respect to storm-related parameters) mean and median forecast
errors also decrease markedly from weak to intense subsamples (Table 13), which
supports this argument. In addition, the CT and AT M scores indicate that the NTCM
predicts the storm direction and speed much more accurately for moderate storms than for
either weak or intense storms. Finally, the weak subsample has much larger mean and
median forecast errors (as well as higher CT and AT M scores) than the other subsamples
throughout the entire forecast period. Thus, the 60-kt specification of the NTCM storm
bogus may be inappropriate for weak storms.

E.  PAST  12-HOUR  INTENSITY  CHANGE  EFFECTS 

The three subsamples of NTCM forecasts are classified as weakening (past 12-h
intensity change, or "Δ intensity" ≤ 0 kt), intensifying (Δ intensity of 5 or 10 kt) and rapidly
intensifying (Δ intensity ≥ 15 kt). As indicated earlier, the number of forecasts (Table 14)
is not equally distributed among the three categories due to the small range of possible
Δ-intensity values.

The  NTCM  CT  M  scores  are  the  lowest  for  the  rapidly  intensifying  storms  (Table  14) 
at  all  forecast  periods,  although  the  intensifying  storms  had  CT  M  scores  almost  as  low  at 
72 h. The AT M scores for the rapid intensifiers were much lower (more than 10 points at
all  three  forecast  periods)  than  those  of  the  weakening  storms.  These  results  indicate  that 
the  NTCM  forecasts  direction  and  speed  more  accurately  for  storms  that  are  intensifying 
(slowly  or  rapidly)  than  for  weakening  storms. 

The  NTCM  mean  and  median  forecast  errors  (Table  15)  follow  the  same  pattern  as  the 
CT  and  AT  M  scores.  That  is,  the  errors  for  the  rapidly  intensifying  storms  are  much 
smaller  than  those  of  the  weakening  storms  (more  than  100  km  smaller  mean  and  median 
errors  at  72  h).  The  trend  of  decreasing  mean  and  median  forecast  errors  from  weakening 
to  intensifying  to  rapidly  intensifying  subsamples  holds  for  all  forecast  periods  except 
between  12  and  36  h.  For  these  periods,  the  median  forecast  errors  increase  slightly  for 
the  intensifying  storms,  and  then  decrease  for  the  rapid  intensifiers  (Table  15). 


The meridional (ΣY) errors (Table 15) for all three categories had small values, which
indicates that no north-south systematic errors exist in the three subsamples. As the zonal
(ΣX) errors for the intensifying storms decrease nearly linearly from 24 h (46 km) to 72 h
(-54  km),  the  NTCM  position  is  initially  east  of  the  best-track  longitude  ("slow"  for  east  to 
west-moving  storms),  and  becomes  west  of  the  best  track  by  72  h.  This  may  be  a  function 
of  the  initial  slow  bias  of  the  NTCM,  which  would  cause  the  point  of  recurvature  to  be 
forecast  too  late  (Sandgathe,  1985).  By  contrast,  the  rapidly  intensifying  storms  have  a 
small  and  nearly  constant  (from  61  to  34  km)  zonal  bias.  In  this  case,  the  initial  slow  bias 
in  the  NTCM  forecasts  is  carried  throughout  the  forecast  period.  A  statistical  scheme  to 
remove  the  initial  slow  bias  of  the  NTCM  should  result  in  a  reduction  in  errors. 
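
The bias-removal idea mentioned above can be sketched as follows. This is only a minimal illustration and not a scheme proposed in this study: it assumes that the systematic zonal error is the mean of the signed east-west differences (forecast minus best track, positive to the east) from a dependent sample, and that this mean is simply subtracted from later forecast offsets. The function names and the sample values are hypothetical.

```python
from statistics import mean

def zonal_bias(signed_errors_km):
    """Mean signed east-west error (forecast minus best track, km, positive = east)."""
    return mean(signed_errors_km)

def remove_bias(forecast_offsets_km, bias_km):
    """Subtract a previously estimated mean bias from each east-west forecast offset."""
    return [x - bias_km for x in forecast_offsets_km]

# Hypothetical signed zonal errors (km) from a dependent sample at one forecast period.
training_errors = [70.0, 40.0, 55.0, 35.0, 50.0]
bias = zonal_bias(training_errors)

# Hypothetical east-west offsets (km) of two later forecasts, corrected with that bias.
corrected = remove_bias([60.0, 45.0], bias)
```

A correction of this kind would presumably have to be estimated separately for each forecast period and storm category, since the zonal error changes sign with time for the intensifying storms but not for the rapid intensifiers.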

The  CLIPER  mean,  and  especially  the  median  forecast  errors  (Table  16)  have  smaller 
differences  among  the  three  categories.  For  example,  the  median  forecast  errors  at  72  h  are 
605,  595  and  616  km  for  the  weakening,  intensifying  and  rapidly  intensifying  storms. 
The relatively small differences in forecast errors between categories are seen at the 12-
through 60-h forecast periods as well. In addition, the mean and median forecast errors
for  each  category  are  within  35  km  of  the  total  error  statistics  (Table  4)  at  every  time  period 
except  72  h,  when  the  mean  forecast  error  for  the  weakening  category  is  52  km  larger  than 
the  total  sample  mean.  This  result  indicates  that  the  CLIPER  forecasts  are  not  affected  by 
changes  in  the  past  12-h  intensity  trend. 

Compared  to  the  CLIPER  errors,  the  NTCM  error  statistics  all  indicate  that  the  NTCM 
has  much  more  skill  at  the  36-  to  72-h  periods  for  both  intensifying  and  rapidly 
intensifying  categories.  For  example,  the  NTCM  median  and  mean  forecast  errors  at  72  h 
are  142  and  127  km  lower  than  the  CLIPER  in  the  rapidly  intensifying  category.  On  the 
other  hand,  the  median  72-h  forecast  error  for  the  NTCM  is  28  km  higher  than  the 
CLIPER  for  the  weakening  category.  Since  the  NTCM  mean  forecast  error  at  72  h  for 
weakening storms is 120 km smaller than the CLIPER error, the NTCM evidently has
fewer very large errors in its forecasts compared to CLIPER, which has a slightly lower
median  forecast  error  at  72  h. 
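
The reasoning that a lower mean combined with a higher median implies fewer very large errors can be illustrated with a small example. The error values below are invented for illustration only and are not taken from Tables 15 or 16; "aid_a" and "aid_b" are hypothetical labels.

```python
from statistics import mean, median

# Hypothetical 72-h forecast errors (km) for two forecast aids.
aid_a = [300, 450, 600, 650, 700]     # tighter spread, higher middle value
aid_b = [250, 400, 580, 900, 1500]    # lower middle value, but two very large misses

# (mean, median) for each aid
summary = {
    "aid_a": (mean(aid_a), median(aid_a)),
    "aid_b": (mean(aid_b), median(aid_b)),
}
```

Here the first aid has the smaller mean but the larger median, because the mean of the second aid is inflated by a few very large errors that the median ignores.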

In  summary,  each  of  the  error  measures  suggests  that  the  NTCM  is  much  more  skillful 
in  forecasting  intensifying  storms  (both  slow  and  rapid)  than  weakening  storms.  The 
marked difference between rapid intensifiers and weakening storms in both CT/AT M
scores and mean/median forecast errors suggests that the performance of the NTCM is
significantly  affected  by  the  past  12-h  intensity  trend  as  well  as  the  initial  intensity. 


F.  SIZE  EFFECTS 

The sample of NTCM forecasts is divided by the initial size (radius of 30-kt winds)
into categories of "small" (size ≤ 100 n.mi), "medium" (size 105 to 205 n.mi) and "large"
(size ≥ 210 n.mi). Although the AT M scores (Table 17) do not vary much between
categories, they are the lowest in the large category. In fact, these scores among the three
categories vary by only four points at 72 h and 10 points at 48 h. This suggests that
the  initial  size  parameter  has  a  diminishing  effect  with  time  on  the  speed  forecast  (AT 
component)  of  the  NTCM. 

The  lowest  CT  M  scores  for  the  NTCM  are  found  in  the  small  category,  where  the 
72-h  M  score  is  more  than  10  points  lower  than  either  the  medium  or  large  categories 
(Table 17). Notice that the largest percentages of two-class CT errors at the 48- and 72-h
time periods occur in the large subsample. Inspection of the 48- and 72-h CT contingency
tables (Tables A-26 and A-27) reveals that a very large number of one- and two-class
errors are located in the lower left bins of the large (size ≥ 210 n.mi) subsample. A
majority  of  the  forecasts  in  the  lower  left  bin  of  the  contingency  table  indicates  that  the 
NTCM  forecast  track  falls  far  to  the  left  of  the  best  track  more  frequently  than  it  does  to  the 
right  of  the  track  (68  left  versus  28  right  at  48  h,  and  71  left  versus  24  right  at  72  h). 
Therefore, the larger the storm, the more often the NTCM forecasts the track to be to the
left of the best track. A possible explanation of this bias to the left of the best track is that
the  NTCM  tends  to  forecast  straight  tracks  for  large  (probably  recurving)  storms.  As  a 
westward-moving  storm  begins  to  turn  to  the  northwest,  a  straight  forecast  would  produce 
large  negative  (left)  CT  components.  In  addition,  a  forecast  that  recurves  the  storm  too  late 
will  also  produce  negative  CT  components.  This  was  observed  in  the  case  of  Typhoon 
Abby  during  1983,  which  began  to  recurve  around  the  western  periphery  of  the  subtropical 
ridge soon after it formed. Because the NTCM (as well as the other objective aids)
continually forecast Abby to move west-northwest, this storm produced some of the
largest forecast errors in this data set and many of the left-of-track one- and two-class CT
errors in the large category (Tables A-25 through A-27).
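
The left-versus-right counts quoted above (68 versus 28 at 48 h, and 71 versus 24 at 72 h) can be put in proportion with a short calculation. The sketch below also applies an exact two-sided binomial sign test, which is not used in this study and is introduced here only as an illustration. It assumes that each forecast is an independent trial, which is optimistic because successive forecasts for the same storm are correlated.

```python
from math import comb

def left_fraction(n_left, n_right):
    """Fraction of the left/right misses that fall to the left of the best track."""
    return n_left / (n_left + n_right)

def two_sided_sign_test(n_left, n_right):
    """Exact two-sided binomial p-value, assuming left and right are equally likely."""
    n = n_left + n_right
    k = max(n_left, n_right)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

frac_48 = left_fraction(68, 28)       # counts quoted in the text for 48 h
p_48 = two_sided_sign_test(68, 28)
frac_72 = left_fraction(71, 24)       # counts quoted in the text for 72 h
p_72 = two_sided_sign_test(71, 24)
```

Roughly seven of every ten such misses fall to the left at 48 h, and about three of every four at 72 h.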

The  mean  and  median  forecast  errors  of  the  NTCM  (Table  18)  seem  to  contradict  the 
above  findings.  That  is,  the  mean  and  median  forecast  errors  are  largest  for  the  small 
category  and  decrease  from  the  small  to  large  categories  (this  applies  to  CLIPER  as  well). 
However,  the  mean  and  median  forecast  errors  do  not  vary  much  among  the  three 
categories  (90  km  or  less  at  all  time  periods)  compared  to  the  differences  found  between 
categories  of  the  other  storm-related  parameters.  The  lower  forecast  errors  for  the  large 
category  may  be  due  to  more  accurate  initial  positions  and  working-best-tracks  for  the  large 
storms.  This  reasoning  assumes  that  the  fix  accuracy  for  very  large  (or  intense)  tropical 
cyclones  is  higher  than  for  small  systems  due  to  better-defined  central  features.  While 
there  are  cases  of  intense  storms  that  have  very  small  radii  of  30-kt  winds,  it  is  generally 
held that the size of tropical cyclones increases with intensity. Thus, smaller
errors  in  initial  position  result  in  smaller  errors  propagated  along  the  forecast  track.  In 
addition,  the  frequency  of  fixes  is  higher  for  very  large  or  intense  storms  because  the 
JTWC  places  higher  priority  on  tasking  satellite  coverage  and  aircraft  reconnaissance  for 
such potentially destructive systems. Because of resource limitations, less threatening
storms  often  receive  less  coverage  in  terms  of  fix  data  during  multiple-storm  situations. 


Only the zonal (ΣX) errors in the large category (Table 18) show a systematic change
with forecast period, from 24 km east of the best track to 121 km west of the best track. As
described above, this increase in the zonal error is interpreted as an NTCM forecast track
continuing  westward  while  the  storm  is  tending  to  recurve  to  the  north.  The  zonal  errors 
for  the  small  and  medium  sizes  tend  to  be  large  from  the  initial  time  and  do  not 
systematically  grow,  which  suggests  difficulties  with  initializing  the  NTCM.  The 
meridional (ΣY) errors for the small storms (Table 18) have a very small systematic trend
from  north  (8  km)  to  south  (-41  km)  of  the  best  track  position,  but  no  systematic  change 
for  the  medium  and  large  storms. 

The  CLIPER  mean  and  median  forecast  errors  (Table  19)  also  indicate  distinctly 
smaller  forecast  errors  for  the  large  category.  The  mean  and  median  CLIPER  errors  at 
72  h  for  the  large  storms  are  200  and  164  km  smaller  than  those  for  the  medium  storms. 
This  sensitivity  of  the  CLIPER  to  the  size  parameter  may  also  be  traced  in  part  to  smaller 
initial  positioning  errors.  Notice  that  the  NTCM  forecast  errors  at  72  h  are  slightly  larger 
than  the  CLIPER  errors  for  the  large  category.  By  contrast,  the  NTCM  mean  and  median 
forecast  errors  at  72  h  for  the  small  and  medium  categories  are  smaller  than  the  CLIPER 
errors  by  at  least  98  km  (Tables  18  and  19)  at  all  forecast  intervals.  This  suggests  that  the 
NTCM  shows  a  higher  skill  level  for  small  and  medium  storms  than  for  large  storms 
relative  to  CLIPER,  even  though  the  actual  error  magnitudes  are  smaller  for  the  large 
storms. 

In  summary,  the  CT  M  scores  and  contingency  tables  indicate  that  the  NTCM  forecast 
tracks  for  large  storms  are  left  of  the  best  track  much  more  often  than  they  are  to  the  right. 
In  addition,  the  NTCM  has  slightly  higher  forecast  errors  at  72  h  for  large  storms  than  the 
CLIPER,  which  indicates  that  the  NTCM  has  little  skill  in  this  category.  Although  the 
forecast errors are slightly larger for the small and medium storms, they are much smaller
than  the  CLIPER  errors,  which  indicates  a  higher  level  of  skill.  In  addition,  there  is  a  large 


systematic decrease in the zonal (ΣX) component for large storms, so that the NTCM
forecast  becomes  farther  west  of  the  best  track  with  forecast  period. 


It  should  be  noted  that  the  radius  of  30-kt  winds  may  not  be  an  accurate  representation 
of the size. The infrequency of wind field measurements makes this storm-related
parameter  the  most  subjective  of  the  five.  In  many  cases,  aircraft  peripheral  data  or 
synoptic  data  from  ships  or  islands  close  to  the  storm  are  not  available,  and  the  TDO  must 
extrapolate  the  size  from  the  most  recent  data  available,  or  estimate  the  size  from  satellite 
imagery.  An  objective  method  for  determining  storm  size  would  be  desirable  to  facilitate 
the  use  of  such  data  in  future  studies. 


Various  error  statistics  for  evaluating  the  effects  of  storm-related  parameters  on  the 
NTCM  are  applied  to  a  sample  of  542  NTCM  forecasts  during  1981-1983.  A  new 
technique  for  computing  the  cross-track  (CT)  and  along-track  (AT)  error  components 
relative  to  CLIPER  forecast  positions  is  found  to  be  very  effective  for  evaluating  the  errors 
in  a  storm-oriented  frame  of  reference.  The  best-track  CT  components  at  each  forecast 
period  are  distributed  normally  about  the  respective  extrapolated  CLIPER  tracks.  The 
NTCM  CT  and  AT  errors  are  related  to  true  storm  movement  (left  or  right,  and  slow  or 
fast)  by  comparison  in  contingency  tables  with  the  verifying  best-track  positions.  An  M 
score  is  used  to  distill  the  information  from  each  contingency  table  into  a  single  penalty 
score.  The  mean  and  median  forecast  errors  and  the  systematic  errors  are  also  calculated. 
The  statistics  of  the  total  sample  (1981  through  1983)  for  the  western  North  Pacific 
indicate  a  slow  bias  in  the  NTCM  forecasts,  especially  at  the  early  (12  to  36  h)  forecast 
periods. 

The  NTCM  forecasts  are  evaluated  within  terciles  for  five  initial  storm-related 
parameters  (latitude,  longitude,  intensity,  intensity  trend  and  size).  For  storms  with  initial 
latitudes  south  of  13°  N,  the  NTCM  predicts  the  direction  and  speed  of  storms  much  better 
than for storms north of 13° N. The forecast errors are lower for the southern storms as
well. By contrast, the NTCM performs relatively poorly at 72 h for storms with initial
latitudes north of 17° N. The CT errors for the northern storms were especially large at 48
and 72 h. The systematic errors and contingency tables indicate that the NTCM has a large
westward and left-of-track bias, which suggests that the NTCM is slow in forecasting
recurvature for storms in the northern area. The NTCM performs better for storms with
initial longitudes west of 129° E. Low CT M scores (only 43.8 at 48 h) and forecast errors
for the NTCM in this region are thought to be a function of the data availability of the
western area relative to the areas farther east.


NTCM forecasts of storms with initial intensities between 50 and 75 kt (moderate
category) are found to have much better CT/AT performance characteristics than weak or
intense categories of storms. The CT contingency tables indicate the NTCM has no bias
left or right of the best track in the moderate category, whereas the weak storms are more
often forecast to the right of best track and intense storms to the left. In agreement with the
CT/AT  statistics,  the  forecast  errors  for  the  moderate  category  are  also  relatively  small. 
The  results  support  the  expectation  that  the  NTCM  would  perform  better  on  storms  with 
initial  intensities  more  closely  resembling  that  of  the  fixed-intensity  bogus  storm.  It  is 


therefore recommended that a variable-intensity storm bogus that agrees with the actual
intensity  be  evaluated  as  an  upgrade  to  the  NTCM.  The  NTCM  has  lower  CT  and  AT  M 
scores,  and  lower  forecast  errors,  for  intensifying  storms  than  for  weakening  storms.  An 
initial  slow  bias  in  the  NTCM  forecasts  tends  to  be  carried  throughout  the  forecast  period 
for  storms  in  the  rapidly  intensifying  category. 

The  radius  of  30-kt  winds  from  the  JTWC  warnings,  which  is  used  as  a  measure  of 
storm  size,  is  a  relatively  subjective  measure  because  no  objective  technique  exists  for 
estimating  the  radius  in  the  absence  of  peripheral  data.  The  NTCM  forecasts  for  very 
large  storms  are  to  the  left  of  the  best  track  much  more  often  than  to  the  right.  A  large 
systematic decrease with increasing forecast period of the zonal (ΣX) error component also
suggests that the NTCM does not show a high degree of skill in forecasting the recurvature
of large systems. The NTCM shows no improvement in the mean and median forecast
errors relative to the CLIPER for the large category, despite having slightly lower errors
than  the  small  and  medium  categories. 


VI 


These  results  provide  the  Typhoon  Duty  Officer  valuable  information  about  the  NTCM 
performance with respect to various storm-related parameters. It is recommended that similar
studies be conducted to provide the same information about the One-way Tropical Cyclone
Model (OTCM) and other dynamic forecast aids. These results should also be used to
construct a decision tree that will provide the TDO with a real-time evaluation of each
forecast  aid.  Such  a  tool  might  contribute  to  reductions  in  track  forecast  errors  of  these 
destructive  cyclones. 
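
As a rough indication of how the findings summarized above might feed such a decision aid, the following sketch lists which of the favorable conditions a given storm satisfies. It is a hypothetical checklist and not the decision tree recommended here: the function name, the inclusive treatment of the 50- to 75-kt limits, and the equal weighting of the conditions are all assumptions, and the thresholds are simply those quoted in this study.

```python
def ntcm_favorable_factors(lat_n, lon_e, intensity_kt, delta_intensity_kt, radius_30kt_nmi):
    """List the conditions under which this study found relatively good NTCM performance."""
    factors = []
    if lat_n < 13.0:
        factors.append("initial latitude south of 13 N")
    if lon_e < 129.0:
        factors.append("initial longitude west of 129 E")
    if 50.0 <= intensity_kt <= 75.0:
        factors.append("initial intensity near the 60-kt bogus")
    if delta_intensity_kt >= 5.0:
        factors.append("intensifying over the past 12 h")
    if radius_30kt_nmi < 210.0:
        factors.append("not in the large size category")
    return factors

# Hypothetical storms: one meeting every condition, one meeting none.
favorable = ntcm_favorable_factors(12.0, 125.0, 60.0, 10.0, 150.0)
unfavorable = ntcm_favorable_factors(20.0, 140.0, 100.0, -5.0, 250.0)
```

An operational tool would need to weight these conditions and account for their interdependence (for example, between size and intensity), which this checklist does not attempt.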


APPENDIX: 

CROSS-TRACK  (CT)  AND  ALONG-TRACK  (AT) 
CONTINGENCY  TABLES 
AND 

PERCENTAGE  OF  ONE-,  TWO-  AND  THREE-CLASS  ERROR  TABLES 


Each  table  in  the  appendix  contains  three  columns  which  correspond  to  different  values 
of  a  storm-related  parameter.  Each  column  contains  a  three-by-three  contingency  table  of 
CT or AT errors on the top row and a table of the percentage of one-, two- and
three-class errors on the bottom row. The contingency tables can be likened to a box with
nine bins which contain the CT or AT error components of the NTCM forecasts compared
with the best track positions. The forecasts and best-track positions are first referenced to
a CLIPER track (either left, right, center or slow, fast, center) and then compared to each
other in the contingency table. If, for example, an NTCM forecast is left of the CLIPER
track and the best track is also left, the number of cases in the upper left bin of the CT
contingency table is increased by one. This bin represents a number of zero-class errors,
as do the other bins on the upper-left to lower-right diagonal. The upper-right and
lower-left bins represent the number of two-class errors, and the remaining bins the
one-class errors. The percentages of the class errors (with respect to the subsample in that
column) are tabulated below the contingency tables. They show the percentage of CT (AT)
class  errors  that  occur  left  (slow),  center,  or  right  (fast)  of  the  best  track  as  well  as  the  total 
percentage  of  class  errors  for  the  subsample. 
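
The binning described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the program used in this study: rows are taken to index the best-track class and columns the NTCM class (so that the lower-left bin holds forecasts far to the left of the best track), the same cut points are applied to the forecasts and the best track, and values that fall exactly on a cut point are counted as center. The 24-h cross-track cut points (-75 and 50 km) are those listed with Table A-19; the sample components are invented.

```python
def tercile(value_km, low_cut, high_cut):
    """Return 0 (left/slow), 1 (center) or 2 (right/fast) relative to the CLIPER track."""
    if value_km < low_cut:
        return 0
    if value_km > high_cut:
        return 2
    return 1

def contingency_table(forecast_km, best_km, low_cut, high_cut):
    """3x3 table of counts; table[best-track class][forecast class] (assumed layout)."""
    table = [[0, 0, 0] for _ in range(3)]
    for f, b in zip(forecast_km, best_km):
        table[tercile(b, low_cut, high_cut)][tercile(f, low_cut, high_cut)] += 1
    return table

def class_error_percentages(table):
    """Percentages of zero-, one- and two-class errors in the table."""
    total = sum(sum(row) for row in table)
    counts = [0, 0, 0]
    for i in range(3):
        for j in range(3):
            counts[abs(i - j)] += table[i][j]
    return [100.0 * c / total for c in counts]

# Hypothetical cross-track components (km) relative to the CLIPER track.
ntcm_ct = [-100.0, 0.0, 80.0, -120.0, 10.0]
best_ct = [-90.0, 60.0, 70.0, 100.0, -10.0]

table = contingency_table(ntcm_ct, best_ct, -75.0, 50.0)
percentages = class_error_percentages(table)
```

In this invented sample, three of the five forecasts fall on the diagonal (zero-class), one is a one-class error, and one is a two-class error in the lower-left bin.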

The tables are organized in the following order:

I.  Storm-related  parameter 

A.  Cross-track  error  components 

1.  24-h  NTCM  forecasts 

2.  48-h  NTCM  forecasts 

3.  72-h  NTCM  forecasts 

B. Along-track error components

1.  24-h  NTCM  forecasts 

2.  48-h  NTCM  forecasts 

3.  72-h  NTCM  forecasts 


TABLE A-1

Cross-track contingencies, percent class errors, and M scores for 24-h NTCM forecasts stratified by latitude


TABLE A-2

Same as A-1, except for 48 h


TABLE A-7

Same as A-1, except for cross-track 24 h stratified by longitude.




TABLE A-10

Same as A-7, except for along-track 24 h.


TABLE A-11

Same as A-10 except for 48 h.


TABLE A-12

Same as A-10 except for 72 h


TABLE A-13

Same as A-1, except for cross-track 24 h stratified by intensity


TABLE  A-17 

Same as A-16, except for 48 h


-275 to -25 km; F > -25 km (Best Track relative to CLIPER)


TABLE  A-19 

Same as A-1, except stratified by past 12-h intensity change (Δ intensity).

Δ Intensity ≤ 0 kt | Δ Intensity 5 to 10 kt | Δ Intensity ≥ 15 kt


Cut Points: L < -75 km; C = -75 to 50 km; R > 50 km (Best Track relative to CLIPER)


TABLE  A-20 

Same as A-19, except for 48 h


TABLE  A-21 

Same as A-19, except for 72 h


Cut  Points:  L  <  -200  km;  C  =  -200  to  200  km;  R  >  200  km  (Best  Track  relative  to  CLIPER) 


TABLE  A-22 

Same as A-19, except for along-track 24 h


TABLE A-23

Same  as  A-22,  except  for  48  h 




TABLE  A-24 

Same  as  A-22,  except  for  72  h 




TABLE  A-25 

Same  as  A-l,  except  for  cross-track  24  h  stratified  by  size  (radius  of  30-kt  winds  in  n.mi). 


TABLE  A-26 

Same as A-25, except for 48 h


TABLE  A-28 

Same  as  A-25,  except  for  along-track  24  h 




M  Score  =  76.9  |  M  Score  =  71.8  |  M  Score  =  63.5 




Same as A-28, except for 72 h.